What makes an image appear realistic? In this work, we are answering thisquestion from a data-driven perspective by learning the perception of visualrealism directly from large amounts of data. In particular, we train aConvolutional Neural Network (CNN) model that distinguishes natural photographsfrom automatically generated composite images. The model learns to predictvisual realism of a scene in terms of color, lighting and texturecompatibility, without any human annotations pertaining to it. Our modeloutperforms previous works that rely on hand-crafted heuristics, for the taskof classifying realistic vs. unrealistic photos. Furthermore, we apply ourlearned model to compute optimal parameters of a compositing method, tomaximize the visual realism score predicted by our CNN model. We demonstrateits advantage against existing methods via a human perception study.
展开▼